




In addition to the subjects mentioned above, mental arithmetic, abacus
arithmetic with the soroban, and geometrical loci and problems of
construction are taught at convenient times. The soroban is an abacus
that was introduced from China. Some writers believe it was introduced
about 1600, but the introduction may have taken place at a far earlier
period. It was, however, only from the middle of the seventeenth century
that it became popular in Japan. Although there was also another sort of
abacus, the sangi, or calculating pieces, the soroban was the sole help in
daily calculation, for the sangi were too cumbersome and better adapted
to more complicated calculations. Even since the introduction of the
Occidental style of calculation in the middle part of the last century, the
soroban has not entirely disappeared, and it is still widely used. This is the
reason why soroban arithmetic is taught at present in primary schools,
together with the Occidental arithmetic.

The adoption of the practical side of mathematical teaching in normal 
schools will certainly be against the wishes of those who insist that begin- 
ners should have theoretical instruction only. But this plan appears to 
prove successful in training young minds to the assimilation of mathematic- 
al ideas; it is especially in agreement with the development of the Japanese 
character which always looks towards the practical. 

It is understood that the new course is not free from faults, but it is 
recommended to the teacher as a standard, and it is believed that the skill 
and the power of adaptation already displayed by the Japanese in so many 
directions will enable them to improve upon it and to develop eventually a 
still more complete and satisfactory plan. It is worthy of note that Japan 
is engaged in this development at the same time that all the peoples in the 
civilized world seem to be considering the possible improvements in the 
teaching of mathematics. 



ON A MEAN DIFFERENCE PROBLEM THAT OCCURS 
IN STATISTICS. 



By H. L. RIETZ, University of Illinois. 



1. Introduction. In comparing the difference between the highest and
lowest examination marks of a pupil in a given subject with the
corresponding difference for the same pupil in some other subject, E. G.
Dexter dealt with data such that, in one subject, say mathematics, each
pupil had ten distinct marks, while in another subject, say Latin,
each pupil had only two or three such marks. In this case, it seems
reasonable to expect that, other things being equal, the extreme marks in
mathematics would tend to differ more than the extreme marks in Latin.




In considering the question of comparing the average values of the 
differences between these extreme marks, for a large number of pupils, 
Dexter proposed to me the following problem: 

An urn contains 101 counters marked 0, 1, 2, ..., 100. Drawings are
made at random taking r at a time, always replacing the counters before 
drawing again. Find the mean difference between numbers on the counters 
taken two at a time, and the mean difference between the most extreme 
numbers on counters taken 3, 4, 5, ..., r at a time. 

I present the solution here not merely because it seems to be a special 
problem of some interest, but mainly because of its relation to a general 
problem in statistics, known as Galton's Difference Problem.* 

It may be worth remarking here that the solution of Galton's Differ- 
ence Problem offers, as a special result, an answer to the question of the most 
suitable ratio between the values of first and second prizes in a competition. 

2. To solve the problem proposed by Dexter, let us define the mean 
difference as the sum of the products of each separate difference by the 
probability of its occurrence. 

a) The case of drawing two counters at a time. 

We may form a difference of 100 in one way, of 99 in two ways, of 98 
in three ways, and so on. Hence the mean value is 



$$M = \frac{100\cdot 1 + 99\cdot 2 + \dots + 1\cdot 100}{1 + 2 + \dots + 100} = \frac{\sum_{x=1}^{100} x(101-x)}{\sum_{x=1}^{100} x} = 34.$$

b) The case of drawing three counters at a time. 

We may form a difference of 100 between extremes in 1X99 ways, of 
99 in 2X98 ways, of 98 in 3X97 ways, and so on. Hence, the mean differ- 
ence between extremes is 



$$M = \frac{100\cdot 99\cdot 1 + 99\cdot 98\cdot 2 + \dots + 2\cdot 1\cdot 99}{99\cdot 1 + 98\cdot 2 + \dots + 1\cdot 99} = \frac{\sum_{x=1}^{99} x(101-x)(100-x)}{\sum_{x=1}^{99} x(100-x)} = 51.$$
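
As a check on cases a) and b), the two means can be reproduced by direct enumeration. The sketch below assumes, as the counts of ways above imply, that the r counters of one drawing are distinct and that every such set is equally likely; the function name `mean_extreme_difference` is illustrative and does not come from the original.

```python
from fractions import Fraction
from itertools import combinations


def mean_extreme_difference(r, top=100):
    """Mean of (max - min) over all sets of r distinct counters marked 0..top.

    Assumption: every set of r distinct counters is equally likely.
    """
    total = 0
    count = 0
    for draw in combinations(range(top + 1), r):
        # combinations() yields sorted tuples, so the extremes are the ends.
        total += draw[-1] - draw[0]
        count += 1
    return Fraction(total, count)


print(mean_extreme_difference(2))  # case a)
print(mean_extreme_difference(3))  # case b)
```

Exact fractions are used so that the comparison with the values in the text does not depend on rounding.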

c) The case of drawing r counters at a time. 

We may form a difference of 100 in $1\times{}^{99}C_{r-2}$ ways, of 99 in
$2\times{}^{98}C_{r-2}$ ways, of 98 in $3\times{}^{97}C_{r-2}$ ways, and so on.
From this, we find, by a very slight reduction, that the mean difference
between extreme numbers on the r counters is given by

*Pearson, Biometrika, Vol. I, pp. 390-399.



237 



$$M = \frac{\sum_{x=1}^{102-r} (101-x)\,x\,(100-x)(99-x)\cdots(103-r-x)}{\sum_{x=1}^{102-r} x\,(100-x)(99-x)\cdots(103-r-x)}.$$



To evaluate this expression for any special value of r requires only
the sum of series that are integral powers of the natural numbers; that is,
of the series $1^s + 2^s + 3^s + \dots + n^s$.
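
The general expression of case c) can also be evaluated exactly by machine instead of through power sums. The following is a minimal sketch in exact rational arithmetic; `math.comb` stands in for the combination symbol C of the text (the common factor (r-2)! cancels in the ratio), and the function name is an assumption of this illustration.

```python
from fractions import Fraction
from math import comb


def mean_extreme_difference_formula(r, top=100):
    """Evaluate the ratio of sums in case c) for counters marked 0..top.

    x counts the positions available to a pair of extremes whose
    difference is d = top + 1 - x; the other r - 2 counters are chosen
    from the d - 1 values lying strictly between the extremes.
    """
    numerator = 0
    denominator = 0
    for x in range(1, top + 3 - r):      # x = 1, ..., top + 2 - r
        d = top + 1 - x                  # difference between extremes
        ways = x * comb(d - 1, r - 2)    # number of drawings with this d
        numerator += d * ways
        denominator += ways
    return Fraction(numerator, denominator)


for r in (2, 3, 4, 5, 10):
    print(r, mean_extreme_difference_formula(r))
```

For r = 2 and r = 3 the loop reduces to the sums written out in cases a) and b).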

If, corresponding to each mark from 0 to 100, there are a fixed number
t of counters in the urn instead of one only, the problem is solved by a
slight extension of the above; since, when t > r, there are a certain number
of ways in which r counters with equal numbers may be drawn.

On the side of statistical applications, obviously data arranged in
frequency groups with respect to some character rarely even approximate to
constant values at equal intervals as suggested by the counters in the above
problem. For example, examination marks on a percentage basis are, in
general, much more frequent at some intervals of the range from 0 to 100
than at others. Thus, as an illustration, the examination marks of a
certain group of 1255 students in foreign languages may be exhibited
in frequency groups as follows:



Intervals      Frequency
50-52.5             1
52.5-57.5           1
57.5-62.5           2
62.5-67.5           6
67.5-72.5          20
72.5-77.5          89
77.5-82.5         195
82.5-87.5         283
87.5-92.5         323
92.5-97.5         307
97.5-              28



This arrangement of a totality with respect to some character is called a
frequency distribution. It is a reasonable assumption that the frequency
of occurrence follows some law of probability; and, to make the matter
fairly simple, we might perhaps assume an ideal system of grading such that
the frequency of marks from 0 to 100 should be proportional to the
101 terms in the expansion of $(p+q)^{100}$, where p+q=1. But the problem of
the mean difference between extreme numbers in drawing numbered
counters with such a law of distribution becomes very complicated. On this
account, we seek an analytic treatment that will bring in functions to which
we can apply the calculus in performing summations.

3. Continuous treatment of mean differences. The problem proposed 
by Dexter has a continuous analogue in the following: 

On a line segment AB of length l, r points $x_1, x_2, \dots, x_r$ are selected
at random. Find the mean value of $|x_r - x_1|$, where $x_1$ and $x_r$ denote the
extremes of the r points.




For the simple case of two points, this problem is solved in some 
standard books* on the theory of probability. 

It is assumed for this problem that the probability of a point selected 
belonging to a constant interval dx of the line AB is the same no matter 
where the interval is chosen. 

Let x be the abscissa of any point on l and x' that of any other point, where
x > x'. Then the probability of a point first selected belonging to an interval
dx', the second selected to dx, and the remaining r-2 to the interval between
x' and x is

$$\frac{dx'}{l}\cdot\frac{dx}{l}\left(\frac{x-x'}{l}\right)^{r-2}.$$

But there are r(r-1) ways to select a group of 1, 1 and r-2 out of r
things. Hence, we have for the mean difference

$$M = r(r-1)\int_0^l\!\int_0^x (x-x')\,\frac{dx'\,dx}{l^2}\left(\frac{x-x'}{l}\right)^{r-2} = \frac{r-1}{r+1}\,l.$$



This result gives, for a constant and continuous distribution from 0 to
100, a mean difference of 33⅓ when selected in sets of two, to compare with
34 for a distribution of a single individual at intervals of one unit,
and a mean difference of 50, to compare with 51, when selections are made
three at a time.
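
A quick numerical check of the continuous values 33⅓ and 50, offered as a sketch only: `exact_mean_range` encodes the closed form (r-1)l/(r+1), and `simulated_mean_range` is a hypothetical helper that estimates the same mean by seeded random sampling. The sample size and the seed are arbitrary choices made for this illustration.

```python
import random
from fractions import Fraction


def exact_mean_range(r, length=100):
    """Closed form (r - 1) / (r + 1) * length for r uniform points."""
    return Fraction(r - 1, r + 1) * length


def simulated_mean_range(r, length=100, trials=20000, seed=0):
    """Monte Carlo estimate of the mean distance between extreme points."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        points = [rng.uniform(0, length) for _ in range(r)]
        total += max(points) - min(points)
    return total / trials


for r in (2, 3):
    print(r, float(exact_mean_range(r)), round(simulated_mean_range(r), 2))
```

The simulated figures should fall close to the exact ones, with the usual sampling error of a few tenths of a unit at this sample size.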

4. Galton's Difference Problem. Let us consider a class of objects and
let some numerical mark be attached to the individuals of this class. For
example, we may think of statistical data representing the statures of men of a
certain class, or the examination marks of a group of pupils. All that is
essential is a totality each individual of which is marked by some number. Let
the marks be represented as abscissas of points on the x-axis, and let us
assume that a function y=f(x) exists such that y dx gives the probability that
a point, located by the mark of an individual taken at random from
the totality, belongs to the interval from x to x+dx. If a sample of
n be selected at random out of the totality, we set the problem of finding the
mean difference between the pth and (p+1)th individuals of the sample,
when the n individuals of the sample are arranged in order of magnitude
with respect to the character.

This is Galton's Difference Problem, and we need only add such differ- 
ences from p=l to p=n—l to solve the problem of the mean difference be- 
tween extremes in samples of n. 

*Czuber, Wahrscheinlichkeitsrechnung, 1903, p. 76.
Borel, Éléments de la Théorie des Probabilités, p. 97.




Pearson stated Galton's Problem in its general form, and obtained*
for the mean difference between the pth and (p+1)th individual

$$D_p = \frac{n!}{(n-p)!\,p!}\int_{-\infty}^{+\infty} \alpha^{\,n-p}(1-\alpha)^{p}\,dx, \qquad (1)$$

where

$$\alpha = \int_{-\infty}^{x} f(x)\,dx. \qquad (2)$$

The problem is thus reduced to one of quadrature. Suppose f(x) is
the probability function of Gauss, written in the form
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}$;
then, tables of values of $\alpha$ with the argument x are easily accessible. In
this case, by quadrature, from (1), using Simpson's two-thirds rule and
intervals of $0.2\sigma$, I have evaluated $D_p$ for n=2 and n=10. The values for
n=3 are given by Pearson. The results are:

For n=2, $D_1 = 1.13\sigma$,†

n=3, $D_1 = 0.846\sigma$, $D_2 = 0.846\sigma$,

n=10, $D_1 = 0.540\sigma$, $D_2 = 0.346\sigma$, $D_3 = 0.282\sigma$, $D_4 = 0.253\sigma$,

$D_5 = 0.247\sigma$, $D_6 = 0.253\sigma$, $D_7 = 0.282\sigma$, $D_8 = 0.346\sigma$, $D_9 = 0.540\sigma$.

Let $E_n$ = the mean difference of extreme values when samples of n are
taken; then

$E_2 = 1.13\sigma$,
$E_3 = 1.69\sigma$, ... (3)
$E_{10} = 3.09\sigma$.
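
The quadrature can be repeated by machine as a rough check on these figures. The sketch below makes several assumptions that are not in the original: it takes σ = 1, replaces the tables of α by `math.erf`, truncates the range of integration at ±8σ, and uses the ordinary composite Simpson rule (weights 1, 4, 2, ..., 4, 1) with steps of 0.2σ, which may not be exactly the rule the author names. The function names are illustrative.

```python
from math import comb, erf, pi, sqrt


def gaussian_cdf(x):
    """alpha of equation (2) for a Gaussian with sigma = 1."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def mean_gap(n, p, step=0.2, limit=8.0):
    """Equation (1) in units of sigma, by the composite Simpson rule."""
    intervals = round(2 * limit / step)   # 80 for the defaults
    if intervals % 2:
        intervals += 1                    # Simpson needs an even count
    h = 2 * limit / intervals
    total = 0.0
    for i in range(intervals + 1):
        a = gaussian_cdf(-limit + i * h)
        value = a ** (n - p) * (1.0 - a) ** p
        if i == 0 or i == intervals:
            weight = 1
        elif i % 2:
            weight = 4
        else:
            weight = 2
        total += weight * value
    return comb(n, p) * total * h / 3.0


def mean_range(n):
    """E_n: the sum of the gaps D_1, ..., D_{n-1}."""
    return sum(mean_gap(n, p) for p in range(1, n))


for n in (2, 3, 10):
    gaps = [round(mean_gap(n, p), 3) for p in range(1, n)]
    print(n, gaps, round(mean_range(n), 2))
```

Small disagreements with the printed values in the last figure are to be expected, in line with the author's own footnote on the doubtful value of that figure.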

To indicate the significance of these results, we may say that
in selecting at random a sample of 10 from a Gaussian distribution, we have
a mean difference between extremes of 3.09/1.13 = 2.73 times as much as when
we take only two at a time, and 3.09/1.69 = 1.82 times as much as when we take
three at a time.

*Loc. cit., p. 392.

†The last figure in these calculations may be of doubtful value.

To apply these results to a reasonable distribution for examination
marks, suppose that marks range from 50 to 100, and that the frequencies
of occurrence of given marks 50, 51, 52, ... are proportional to the terms in
the expansion of $(\tfrac12+\tfrac12)^{50}$. Then the distribution is well described by
a Gaussian function in which

$$\sigma = \sqrt{50\times\tfrac12\times\tfrac12} = \frac{5}{\sqrt{2}}.$$

In this case,

$E_2 = 3.99$,
$E_3 = 5.97$,
$E_{10} = 10.92$.
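
The arithmetic of this application can be sketched as follows. The coefficients of σ are copied from the results (3); the exact computation for pairs is an addition of this illustration and assumes two independent marks whose frequencies follow the binomial terms, an assumption the text does not state explicitly. All names are hypothetical.

```python
from math import comb, sqrt

# Coefficients of sigma taken from the results (3) in the text.
RANGE_IN_SIGMAS = {2: 1.13, 3: 1.69, 10: 3.09}

TRIALS = 50                        # exponent in (1/2 + 1/2)^50
sigma = sqrt(TRIALS * 0.5 * 0.5)   # standard deviation of the binomial

for n, coefficient in RANGE_IN_SIGMAS.items():
    print(n, round(coefficient * sigma, 3))


def exact_pair_mean_difference(trials=TRIALS):
    """Exact mean |X - Y| for two independent binomial(trials, 1/2) marks."""
    weights = [comb(trials, k) for k in range(trials + 1)]
    total = sum(
        weights[i] * weights[j] * abs(i - j)
        for i in range(trials + 1)
        for j in range(trials + 1)
    )
    return total / 4 ** trials     # (2^trials)^2 equally weighted outcomes


print(round(exact_pair_mean_difference(), 3))
```

The exact discrete value for pairs lies close to the Gaussian figure for samples of two, which is the sense in which the binomial distribution is "well described" by the Gaussian function here.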



The results (3) give a precise notion as to the values of the mean dif- 
ference between extreme individuals of a small sample taken at random 
from a totality that is distributed in accord with Gauss's law. 

Since many frequency distributions not well described by Gauss's
law are well described by its generalizations, it seems likely that results
such as concern us here, when derived from Gauss's curve, apply at least
roughly to many more general types of frequency distributions that occur in
statistics. To illustrate, the distribution of examination marks given on p.
237 is not at all well fitted by numbers proportional to the terms in the
expansion of $(\tfrac12+\tfrac12)^{50}$; but, if we use the ungraduated values there given to
evaluate (1) by quadrature, we obtain, for the mean difference in taking
sets of two, the value 7.71. But, if we compute $\sigma$ as the square root of the
mean square of the deviations of observations from their mean value, the
result is 7.42. Then, from (3), $E_2 = 8.38$, to compare with 7.71; and, hence,
the result from the assumption of the law of Gauss gives at least a good
general notion of the mean difference in question.



DEPARTMENTS. 



SOLUTIONS OF PROBLEMS. 



ALGEBRA. 



842. Proposed by E. B. ESCOTT, Ann Arbor, Michigan. 



Prove that $\frac{1}{1\cdot 2\cdot 3\cdot 4} + \frac{1}{5\cdot 6\cdot 7\cdot 8} + \dots = \tfrac14\log 2 - \tfrac{1}{24}\pi$. [Hobson's Plane
Trigonometry, page 348.]



